Let's load the dataset and take a look at the first few rows.

After taking an initial look at the data, we can see that there are quite a few extraneous rows & columns that we do not need for our analysis.

Since we are trying to predict the population in 2020 (and then use that prediction to figure out how many electoral votes each state will get for the 2024 and 2028 elections), we'll keep only the columns that contain population estimates for 2010 and 2019.

Now we will predict the population for each state using the formula \begin{equation*}A=Pe^{rt}\end{equation*}

where \begin{equation*}t = 1\end{equation*}

\begin{equation*}r = (\frac{1}{9})(\frac{pop_{2019}}{pop_{2010}} - 1)\end{equation*}

\begin{equation*}P = pop_{2019}\end{equation*}

We will also account for the fact that the annual population estimates are as of July 1 of each year, whereas the Census count is as of April 1.
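The projection above can be sketched in a few lines of Python. This is only an illustration: the function name `project_population` and the sample figures are hypothetical. Using `t = 0.75` to land on April 1 instead of July 1 is one possible way to handle the date mismatch, and is an assumption here, not necessarily the adjustment used in this notebook.

```python
import math

def project_population(pop_2010, pop_2019, t=1.0):
    """Project a population t years past the 2019 estimate with A = P * e^(r*t)."""
    # Average annual growth rate over the nine years between the two estimates.
    r = (pop_2019 / pop_2010 - 1) / 9
    return pop_2019 * math.exp(r * t)

# Hypothetical round numbers, not real Census figures.
july_2020 = project_population(1_000_000, 1_090_000)           # t = 1 (July 1, 2020)
april_2020 = project_population(1_000_000, 1_090_000, t=0.75)  # assumed April 1 adjustment
print(round(july_2020), round(april_2020))
```

Keeping `t` as a parameter makes it easy to compare the July 1 and April 1 projections side by side.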

Now that we have our 2020 Census Population estimate, we will use the Huntington-Hill method to allocate votes to each state. Each state starts off with one vote and the next state to receive a vote is the state with the highest priority number. The priority number of the nth state is equal to \begin{equation*}\frac{P_n}{\sqrt{v(v+1)}}\end{equation*}

where $P_n$ is the population of that state and $v$ is the number of votes that the state currently has.

After every round (where one vote has been allocated), the state that received the vote has its priority number recalculated. We repeat this until 435 votes have been allocated in total, counting the one vote each state starts with: 538 electoral votes - 100 Senate votes - 3 votes for D.C.

If this number looks familiar, that is because it is the number of seats in the House of Representatives (we are really just allocating House seats here). To test our code, we will apportion votes based on 2010 Census data and compare the result to the actual allocation that took place.
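As a rough sketch of the Huntington-Hill procedure described above, the following uses a heap to find the state with the highest priority number in each round. The function name `huntington_hill` and the three-state data are hypothetical, chosen only to keep the example small. Because every state starts with one seat, the loop runs `seats` minus the number of states times.

```python
import heapq
import math

def huntington_hill(populations, seats=435):
    """Apportion `seats` among states; `populations` maps state -> population."""
    if seats < len(populations):
        raise ValueError("need at least one seat per state")
    # Every state starts with one seat.
    votes = {state: 1 for state in populations}
    # heapq is a min-heap, so priority numbers are stored negated.
    heap = [(-pop / math.sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(seats - len(populations)):
        _, state = heapq.heappop(heap)
        votes[state] += 1
        v = votes[state]
        # Only the state that just gained a seat needs a new priority number.
        heapq.heappush(heap, (-populations[state] / math.sqrt(v * (v + 1)), state))
    return votes

# Hypothetical three-state example, not real Census data.
toy_populations = {"A": 6_000_000, "B": 3_000_000, "C": 1_000_000}
toy_seats = huntington_hill(toy_populations, seats=10)
print(toy_seats)
```

Checking the function against a known apportionment, as the text does with the 2010 Census, is a good way to catch off-by-one errors in the loop count.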

After we have allocated all the votes, we will add 2 to each state's vote count (one for each of its Senators) and find the difference between its 2010 and 2020 vote totals.
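A minimal sketch of this step, assuming the House seats for each year are held in plain dictionaries (the helper names and the seat counts below are hypothetical):

```python
def electoral_votes(house_seats, senators=2):
    """Add each state's Senate seats to its House seats."""
    return {state: seats + senators for state, seats in house_seats.items()}

def vote_change(old_votes, new_votes):
    """Positive values are gains, negative values are losses."""
    return {state: new_votes[state] - old_votes[state] for state in new_votes}

# Hypothetical seat counts for three made-up states.
house_2010 = {"A": 6, "B": 3, "C": 1}
house_2020 = {"A": 5, "B": 4, "C": 1}
change = vote_change(electoral_votes(house_2010), electoral_votes(house_2020))
print(change)
```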

Now let's create a new data frame with vote counts for each state. Then we will add each state's two-letter code.

We will use Plotly's graph objects module (`plotly.graph_objects`) to create a visual for our data.

Now we will look at some election results from the last 50 years using the `choropleth` function from Plotly Express.

To see how the distribution of electoral votes by state and region has changed over the course of American history, we will examine the same data set from the previous part.

Now you may have noticed that the votes in the Electoral College are not allocated perfectly according to each state's population. Each state automatically starts off with 3 votes (2 Senate Votes + at least 1 House Seat). This is one point of criticism frequently made by detractors of the Electoral College. It undervalues the people living in states like California, Texas, New York, and Florida, while overrepresenting the people in Wyoming, Vermont, North Dakota, and Alaska.

To understand this disparity and get a visual of how bad it is, we will look at Census Data from 1960 all the way to our previously predicted 2020 figures.

Our first step is to load the data set, remove the D.C. data, add two-letter state codes, and add our predicted population numbers for 2020.

Now we will do what we did earlier, allocating electoral votes to each state and doing this for every 10-year period. However, there is a catch: we will do this the normal way (each state's electoral votes being equal to the number of its Representatives + its number of Senators) and in a truly proportional way (allocate 537 votes among the 50 states).

Why 537? 538 total votes minus the 1 vote that D.C. would receive based on its population size.

Now we will repeat the exact same process using a truly proportional allocation.

To see the disparity between the largest and smallest states, we will calculate the population per electoral vote for each row.
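The calculation itself is a single division per state. Here is a small sketch, assuming populations and vote counts are keyed by state in dictionaries; the helper name and the numbers are hypothetical, and in the notebook the same division would be applied to each row of the data frame.

```python
def people_per_vote(populations, votes):
    """Number of residents represented by one electoral vote in each state."""
    return {state: populations[state] / votes[state] for state in populations}

# Hypothetical populations and vote counts, not real Census data.
toy_pop = {"Large": 6_000_000, "Small": 600_000}
toy_votes = {"Large": 8, "Small": 3}
per_vote = people_per_vote(toy_pop, toy_votes)
print(per_vote)
```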

To see how many votes each state gains or loses, we will take the difference between each state's actual electoral votes and the votes it would have if they were allocated perfectly proportionally.

Now we will plot the difference in votes for each state over the decades.

To get a better picture of how the Electoral College underrepresents some states and overrepresents others, let's create a visual that shows how many people are represented by one electoral college vote in each state.

Now let's do the same thing we did in the previous part, but this time, we will use a state's theoretical votes (if they were allocated perfectly proportionally). As you can expect, the disparity between a state like California and a state like Wyoming will not be as big in this scenario.

But even the maps we just created make it hard to discern the difference between states.

Let's visualize the same data in a more readable way by normalizing it: we divide each state's population per electoral vote by the lowest population per electoral vote (across all states) for that 10-year period.

So in 1960, Alaska will have a value equal to 1 (since it has the lowest number for this metric), while California will have a value closer to 5.

If you are having trouble with how to interpret this number, here is an example: In 1960, Alaska has a value of 1 and California has a value of 5.21. This means that each Alaskan vote is equal to 5.21 Californian votes. Alternatively, a person in Alaska has 5.21 times the voting power of a person in California.
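The normalization can be sketched as follows for a single 10-year period. The function name and input values are hypothetical; in the notebook the same step would be applied separately to each Census year.

```python
def relative_voting_power(per_vote):
    """Divide each state's people-per-vote by the smallest value in the period."""
    lowest = min(per_vote.values())
    return {state: value / lowest for state, value in per_vote.items()}

# Hypothetical people-per-vote figures for one period.
toy_per_vote = {"Large": 750_000, "Small": 200_000}
normalized = relative_voting_power(toy_per_vote)
print(normalized)
```

The state with the lowest population per vote always comes out as 1, and every other value reads as "how many of this state's residents share the voting power of one resident of the best-represented state."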

Let's do the same thing for the votes that we allocated proportionally. Once again, states with smaller populations tend to have an advantage, but the differences are not as grotesque as in the current system.

Now that we know that the Electoral College distorts votes, let's see if that distortion may have affected the results of any recent presidential elections. The election we will look at is the 2000 election, where 537 votes in the state of Florida gave George W. Bush the edge over Al Gore.

To see if the Electoral College helped swing the election, we will find how many electoral votes each candidate would have received had all votes been allocated proportionally. For this theoretical vote count, we will use the "theoretical" allocation of seats after the 1990 Census (since the 2000 Census was used to decide the allocation of seats for 2004 and 2008).

It turns out that Gore would have won the election under this scenario.